Multirelational Consensus Clustering with Nonnegative Decompositions

Author

  • Liviu Badea
Abstract

Unsupervised multirelational learning (clustering) in non-sparse domains such as molecular biology is especially difficult, as most clustering algorithms tend to produce different clusters in slightly different runs (with either different initializations or slightly different training data). In this paper we develop a multirelational consensus clustering algorithm based on nonnegative decompositions, which are known to produce sparser and more interpretable clusterings than other data-oriented algorithms. We apply this algorithm to the joint analysis of the largest available gene expression datasets for leukemia and for normal hematopoiesis, in order to develop a more comprehensive genomic characterization of the heterogeneity of leukemia in terms of 38 normal hematopoietic cell states. Surprisingly, we find unusually complex expression programs involving large numbers of transcription factors, whose further in-depth analysis may help develop personalized therapies.
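
The instability the abstract describes, and the general idea of a consensus over runs, can be illustrated with a minimal sketch. This is not the paper's multirelational algorithm, which the abstract does not specify. It is a generic single-matrix consensus scheme that assumes plain NMF with multiplicative updates and a hard assignment of each sample to its dominant factor. The function names `nmf` and `consensus_matrix`, the parameter defaults, and the toy data are all illustrative assumptions.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF via multiplicative updates: X ~ W @ H with W, H >= 0.
    (Assumed stand-in; the paper's decomposition may differ.)"""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def consensus_matrix(X, k, n_runs=20):
    """Fraction of runs in which each pair of samples (columns of X)
    falls in the same cluster; runs differ only in initialization."""
    m = X.shape[1]
    C = np.zeros((m, m))
    for seed in range(n_runs):
        _, H = nmf(X, k, seed=seed)
        labels = H.argmax(axis=0)  # dominant factor per sample
        C += labels[:, None] == labels[None, :]
    return C / n_runs

# Hypothetical toy data: 6 features x 10 samples, two groups of 5 samples.
rng = np.random.default_rng(42)
pattern_a = np.array([1, 1, 1, 0, 0, 0])[:, None]
pattern_b = np.array([0, 0, 0, 1, 1, 1])[:, None]
X = np.hstack([
    pattern_a + 0.1 * rng.random((6, 5)),
    pattern_b + 0.1 * rng.random((6, 5)),
])

C = consensus_matrix(X, k=2)
print(np.round(C, 2))
```

Entries of `C` near 1 or 0 indicate pairs of samples that are grouped consistently across runs, while intermediate values flag assignments that depend on the initialization.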

Similar resources

Tensor Decompositions: A New Concept in Brain Data Analysis?

Matrix factorizations and their extensions to tensor factorizations and decompositions have become prominent techniques for linear and multilinear blind source separation (BSS), especially multiway Independent Component Analysis (ICA), Nonnegative Matrix and Tensor Factorization (NMF/NTF), Smooth Component Analysis (SmoCA) and Sparse Component Analysis (SCA). Moreover, tensor decompositions hav...

Entropy-based Consensus for Distributed Data Clustering

The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...

Nonnegative Ranks, Decompositions, and Factorizations of Nonnegative Matrices

The nonnegative rank of a nonnegative matrix is the smallest number of nonnegative rank-one matrices into which the matrix can be decomposed additively. Such decompositions are useful in diverse scientific disciplines. We obtain characterizations and bounds and show that the nonnegative rank can be computed exactly over the reals by a finite algorithm.

Nonnegative Matrix Factorizations for Clustering: A Survey

Recently there has been significant development in the use of non-negative matrix factorization (NMF) methods for various clustering tasks. NMF factorizes an input nonnegative matrix into two nonnegative matrices of lower rank. Although NMF can be used for conventional data analysis, the recent overwhelming interest in NMF is due to the newly discovered ability of NMF to solve challenging data ...

Clustering and Metaclustering with Nonnegative Matrix Decompositions

Although very widely used in unsupervised data mining, most clustering methods are affected by the instability of the resulting clusters w.r.t. the initialization of the algorithm (as e.g. in k-means). Here we show that this problem can be elegantly and efficiently tackled by meta-clustering the clusters produced in several different runs of the algorithm, especially if “soft” clustering algori...

Publication date: 2012